1.
biorxiv; 2023.
Preprint in English | bioRxiv | ID: ppzbmed-10.1101.2023.10.28.564530

ABSTRACT

Understanding how the genome of a virus evolves depending on the host it infects is an important question that challenges our knowledge about several mechanisms of host-pathogen interactions, including mutational signatures, innate immunity, and codon optimization. A key facet of this general topic is the study of viral genome evolution after a host-jumping event, a topic which has experienced a surge in interest due to the fight against emerging pathogens such as SARS-CoV-2. In this work, we tackle this question by introducing a new method to learn Maximum Entropy Nucleotide Bias (MENB) models reflecting single-, di- and tri-nucleotide usage, which can be trained from viral sequences that infect a given host. We show that both the viral family and the host leave a fingerprint in nucleotide usage which MENB models decode. When the task is to classify both the host and the viral family for a sequence of unknown viral origin, MENB models outperform state-of-the-art methods based on deep neural networks. We further demonstrate the generative properties of the proposed framework, presenting an example in which we alter the nucleotide usage of the 1918 H1N1 Influenza A sequence, diminishing its CpG content, without changing its protein sequence. Finally, we consider two well-known cases of zoonotic jumps, for the H1N1 Influenza A and for the SARS-CoV-2 viruses, and show that our method can be used to track the adaptation to the new host and to shed light on the most relevant selective pressures which have acted on motif usage during this process. Our work has wide-ranging applications, including integration into metagenomic studies to identify hosts for diverse viruses, surveillance of emerging pathogens, prediction of synonymous mutations that affect immunogenicity during viral evolution in a new host, and the estimation of putative evolutionary ages for viral sequences in similar scenarios. Additionally, the computational framework introduced here can be used to assist vaccine design by tuning motif usage with fine-grained control.

Author summary

In our research, we delved into the fascinating world of viruses and their genetic changes when they jump from one host to another, a critical topic in the study of emerging pathogens. We developed a novel computational method to capture how viruses change the nucleotide usage of their genes when they infect different hosts. We found that viruses from various families have unique strategies for tuning their nucleotide usage when they infect the same host. Our model could accurately pinpoint which host a viral sequence came from, even when the sequence was vastly different from the ones we trained on. We demonstrated the power of our method by altering the nucleotide usage of an RNA sequence without affecting the protein it encodes, providing a proof of concept of a method that can be used to design better RNA vaccines or to fine-tune other nucleic acid-based therapies. Moreover, the framework we introduce can help track emerging pathogens, predict synonymous mutations in the adaptation to a new host, and estimate how long viral sequences have been evolving in it. Overall, our work sheds light on the intricate interactions between viruses and their hosts.
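
The idea of lowering CpG content while leaving the encoded protein unchanged can be illustrated with a much simpler procedure than the MENB framework described above. The sketch below is not the authors' method: it is a greedy, codon-by-codon synonymous recoding, and the function names (`translate`, `count_cpg`, `reduce_cpg`) are hypothetical. It assumes an uppercase DNA coding sequence over A, C, G, T, in frame, translated with the standard genetic code.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate an in-frame DNA sequence into a one-letter protein string."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def count_cpg(seq):
    """Count occurrences of the dinucleotide CG."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


def reduce_cpg(seq):
    """Greedy synonymous recoding (illustrative assumption, not the MENB sampler).

    Each codon is replaced by the synonymous codon that yields the fewest CG
    dinucleotides in its local context (previous base, codon, next base).
    The original codon is kept when it is already among the best choices.
    """
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    for k, codon in enumerate(codons):
        left = codons[k - 1][-1] if k > 0 else ""
        right = codons[k + 1][0] if k + 1 < len(codons) else ""
        synonyms = [c for c, a in CODON_TABLE.items() if a == CODON_TABLE[codon]]
        codons[k] = min(
            synonyms,
            key=lambda c: (count_cpg(left + c + right), c != codon),
        )
    return "".join(codons)


if __name__ == "__main__":
    original = "ATGGCGCGATCGTAA"  # toy sequence, not a real viral gene
    recoded = reduce_cpg(original)
    print(original, count_cpg(original), translate(original))
    print(recoded, count_cpg(recoded), translate(recoded))
```

A greedy pass like this only targets one motif and ignores every other constraint on the sequence; a model-based approach such as the one in the abstract would instead weigh single-, di- and tri-nucleotide usage jointly.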

2.
ssrn; 2020.
Preprint in English | PREPRINT-SSRN | ID: ppzbmed-10.2139.ssrn.3611280

ABSTRACT

SARS-CoV-2 infection can lead to acute respiratory syndrome in patients, which can be due in part to dysregulated immune signalling. We analyze here the occurrences of CpG dinucleotides, which are putative pathogen-associated molecular patterns, along the viral sequence. Carrying out a comparative analysis with other ssRNA viruses and within the Coronaviridae family, we find that the CpG content of SARS-CoV-2, while low compared to other betacoronaviruses, fluctuates widely along its primary sequence. While the CpG relative abundance and its associated CpG force parameter are low for the spike protein (S) and comparable to circulating seasonal coronaviruses such as HKU1, they are much greater and comparable to SARS and MERS for the 3'-end of the viral genome. In particular, the nucleocapsid protein (N), whose transcripts are relatively abundant in the cytoplasm of infected cells and present in the 3'UTRs of all subgenomic RNA, has high CpG content. We speculate that this dual nature of CpG content can confer on SARS-CoV-2 a high ability to both enter the host and trigger pattern recognition receptors (PRRs) in different contexts. We then investigate the evolution of synonymous mutations since the outbreak of the COVID-19 pandemic. Using a new application of selective forces on dinucleotides to estimate context-driven mutational processes, we find that synonymous mutations seem driven both by the viral codon bias and by the high value of the CpG force in the N protein, leading to a loss in CpG content. Sequence motifs preceding these CpG-loss-associated loci match recently identified binding patterns of the Zinc Finger Antiviral Protein (ZAP).

Funding: This work was partially supported by the ANR19 Decrypted CE30-0021-01 grants. B.G. was supported by National Institutes of Health grants 7R01AI081848-04, 1R01CA240924-01, a Stand Up to Cancer – Lustgarten Foundation Convergence Dream Team Grant, and The Pershing Square Sohn Prize – Mark Foundation Fellow supported by funding from The Mark Foundation for Cancer Research.


Subject(s)
Severe Acute Respiratory Syndrome, COVID-19
3.
biorxiv; 2020.
Preprint in English | bioRxiv | ID: ppzbmed-10.1101.2020.05.06.074039

ABSTRACT

COVID-19 can lead to acute respiratory syndrome, which can be due to dysregulated immune signaling. We analyze the distribution of CpG dinucleotides, a pathogen-associated molecular pattern, in the SARS-CoV-2 genome. We find that the CpG content, which we characterize by a force parameter that accounts for statistical constraints acting on the genome at the nucleotide and amino-acid levels, is, on average, low compared to other pathogenic betacoronaviruses. However, the CpG force fluctuates widely along the genome, with a particularly low value, comparable to the circulating seasonal HKU1, in the spike coding region and a greater value, comparable to SARS and MERS, in the highly expressed nucleocapsid coding region (N ORF), whose transcripts are relatively abundant in the cytoplasm of infected cells and present in the 3'UTRs of all subgenomic RNA. This dual nature of CpG content could confer on SARS-CoV-2 the ability to avoid triggering pattern recognition receptors upon entry, while eliciting a stronger response during replication. We then investigate the evolution of synonymous mutations since the outbreak of the COVID-19 pandemic, finding a signature of CpG loss in regions with a greater CpG force. Sequence motifs preceding the CpG-loss-associated loci in the N ORF match recently identified binding patterns of the Zinc Finger Antiviral Protein. Using a model of the viral gene evolution under human host pressure, we find that synonymous mutations in the SARS-CoV-2 genome, and particularly in the N ORF, seem driven by the viral codon bias, the transition-transversion bias and the pressure to lower CpG content.
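
How CpG content can vary along a genome may be illustrated with a simpler statistic than the force parameter used in the abstract. The sketch below computes the CpG relative abundance, taken here as the observed CG dinucleotide frequency divided by the product of the C and G frequencies, in sliding windows. It does not compute the CpG force, which additionally accounts for amino-acid constraints. The function names, window size and step are illustrative assumptions.

```python
import math


def cpg_relative_abundance(seq):
    """Observed/expected CpG ratio: f(CG) / (f(C) * f(G)).

    Returns NaN when the ratio is undefined (sequence too short,
    or no C or no G present).
    """
    n = len(seq)
    if n < 2:
        return math.nan
    f_c = seq.count("C") / n
    f_g = seq.count("G") / n
    if f_c == 0 or f_g == 0:
        return math.nan
    # Overlapping dinucleotide count, normalised by the n - 1 dinucleotide positions.
    f_cg = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG") / (n - 1)
    return f_cg / (f_c * f_g)


def sliding_profile(seq, window, step):
    """Return (start, ratio) pairs for consecutive windows along the sequence."""
    return [
        (start, cpg_relative_abundance(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    toy = "CGCGCGATATATATGCGCAT"  # toy sequence, not a real viral genome
    for start, ratio in sliding_profile(toy, window=10, step=5):
        print(start, ratio)
```

A ratio well below 1 in a window indicates CpG depletion relative to what the local C and G content would predict; comparing windows over coding regions such as S and N would be the analogue of the region-by-region comparison made in the abstract.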


Subject(s)
COVID-19